Prediction of NOx emission from fluid catalytic cracking unit based on ensemble empirical mode decomposition and long short-term memory network
Chong CHEN, Zhu YAN, Jixuan ZHAO, Wei HE, Huaqing LIANG
Journal of Computer Applications    2022, 42 (3): 791-796.   DOI: 10.11772/j.issn.1001-9081.2021040787
Nitrogen oxide (NOx) is one of the main pollutants in the regenerated flue gas of the Fluid Catalytic Cracking (FCC) unit. Accurate prediction of NOx emission can effectively help refinery enterprises avoid pollution incidents. Given the non-stationarity, nonlinearity and long-memory characteristics of pollutant emission data, a new hybrid model incorporating Ensemble Empirical Mode Decomposition (EEMD) and a Long Short-Term Memory network (LSTM) was proposed to improve the prediction accuracy of pollutant emission concentration. First, the NOx emission concentration data was decomposed into several Intrinsic Mode Functions (IMFs) and a residual by the EEMD model. Based on the correlation analysis between the IMF sub-sequences and the original data, the IMF sub-sequences with low correlation were eliminated, which effectively reduced the noise in the original data. The remaining IMFs were then divided into high- and low-frequency sequences, which were trained in LSTM networks of different depths. The final NOx concentration prediction was reconstructed from the predicted results of the sub-sequences. Compared with a plain LSTM in NOx emission prediction for the FCC unit, EEMD-LSTM reduced the Mean Square Error (MSE) and Mean Absolute Error (MAE) by 46.7% and 45.9% respectively, and improved the coefficient of determination (R2) by 43%, which means the proposed model achieves higher prediction accuracy.
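
As a rough illustration of the decompose-filter-predict-reconstruct pipeline described in this abstract, a minimal sketch is given below. It assumes the PyEMD and Keras libraries; the window length, network size, correlation threshold and training settings are illustrative placeholders, not the authors' configuration.

```python
import numpy as np
from PyEMD import EEMD
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

def make_windows(series, lag=12):
    """Turn a 1-D series into (samples, lag, 1) windows and next-step targets."""
    X = np.stack([series[i:i + lag] for i in range(len(series) - lag)])
    return X[..., None], series[lag:]

def eemd_lstm_forecast(nox, lag=12, corr_threshold=0.1):
    # 1. Decompose the NOx series into IMFs (plus a residual).
    imfs = EEMD().eemd(nox)                      # shape: (n_imfs, len(nox))
    # 2. Drop IMFs weakly correlated with the original signal (denoising).
    keep = [imf for imf in imfs
            if abs(np.corrcoef(imf, nox)[0, 1]) >= corr_threshold]
    # 3. Train one LSTM per retained sub-sequence; a deeper network could be
    #    used for high-frequency IMFs and a shallower one for low-frequency.
    total = np.zeros(len(nox) - lag)
    for imf in keep:
        X, y = make_windows(imf, lag)
        model = Sequential([LSTM(32, input_shape=(lag, 1)), Dense(1)])
        model.compile(optimizer="adam", loss="mse")
        model.fit(X, y, epochs=20, batch_size=32, verbose=0)
        # 4. Reconstruct: the final prediction is the sum of sub-predictions.
        total += model.predict(X, verbose=0).ravel()
    return total
```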

Xgboost algorithm optimization based on gradient distribution harmonized strategy
LI Hao, ZHU Yan
Journal of Computer Applications    2020, 40 (6): 1633-1637.   DOI: 10.11772/j.issn.1001-9081.2019101878
In order to solve the problem of the low detection rate of the minority class by the ensemble learning model eXtreme gradient boosting (Xgboost) in binary classification, an improved Xgboost algorithm based on a gradient distribution harmonized strategy, called Loss Contribution Gradient Harmonized Algorithm (LCGHA)-Xgboost, was proposed. Firstly, Loss Contribution (LC) was defined to simulate the losses of samples in the Xgboost algorithm. Secondly, Loss Contribution Density (LCD) was defined to measure the difficulty of samples being correctly classified by the Xgboost algorithm. Finally, a gradient distribution harmonized algorithm, LCGHA, was proposed to dynamically adjust the first-order gradient distribution of samples according to their LCD. In this algorithm, the losses of hard samples (mainly in the minority class) were indirectly increased and the losses of easy samples (mainly in the majority class) were indirectly reduced, making the Xgboost algorithm tend to learn the hard samples. The experimental results show that, compared with three ensemble learning algorithms, Xgboost, GBDT (Gradient Boosting Decision Tree) and Random Forest, LCGHA-Xgboost increases recall by 5.4%-16.7% and Area Under the Curve (AUC) by 0.94%-7.41% on multiple UCI datasets, and increases recall by 44.4%-383.3% and AUC by 5.8%-35.6% on the WebSpam-UK2007 and DC2010 datasets. LCGHA-Xgboost can effectively improve the classification and detection ability for the minority class and reduce its classification error rate.
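
The gradient-harmonizing idea lends itself to Xgboost's custom-objective hook. Below is a schematic sketch of that idea: each sample's first-order gradient is reweighted inversely to the density of similarly difficult samples, so rare hard (minority-class) samples gain influence. The binning scheme, bin count and weighting formula are illustrative assumptions, not the paper's exact LC/LCD definitions.

```python
import numpy as np
import xgboost as xgb

def lcgha_objective(preds, dtrain, n_bins=10):
    y = dtrain.get_label()
    p = 1.0 / (1.0 + np.exp(-preds))          # sigmoid of the raw margin
    grad = p - y                               # first-order gradient
    hess = p * (1.0 - p)                       # second-order gradient
    # Difficulty proxy: |gradient|, close to 1 for badly classified samples.
    g = np.abs(grad)
    bins = np.minimum((g * n_bins).astype(int), n_bins - 1)
    density = np.bincount(bins, minlength=n_bins) / len(g)
    # Harmonizing weight: inverse of the sample's bin density, so gradients
    # from sparse (hard) regions are amplified and dense (easy) ones damped.
    weight = 1.0 / (density[bins] * n_bins + 1e-12)
    return grad * weight, hess

# Usage sketch:
# dtrain = xgb.DMatrix(X_train, label=y_train)
# booster = xgb.train({"max_depth": 6}, dtrain, num_boost_round=100,
#                     obj=lcgha_objective)
```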
Semi-exponential gradient strategy and empirical analysis for online portfolio selection
WU Wanting, ZHU Yan, HUANG Dingjiang
Journal of Computer Applications    2019, 39 (8): 2462-2467.   DOI: 10.11772/j.issn.1001-9081.2018122588
Since the high-frequency asset allocation adjustment of traditional portfolio strategies in each investment period results in high transaction costs and poor final returns, a Semi-Exponential Gradient portfolio (SEG) strategy based on machine learning and online learning was proposed. Firstly, the SEG strategy model was established by adjusting the portfolio only in the initial period of each segment of the investment horizon and not trading the rest of the time; an objective function was then constructed by combining income and loss. Secondly, the closed-form solution of the iterative portfolio update was derived by using the factor graph algorithm, and a theorem bounding the cumulative loss of the accumulated assets, together with its proof, was given, theoretically guaranteeing the return performance of the strategy. Experiments were performed on several datasets such as the New York Stock Exchange dataset. The experimental results show that the proposed strategy maintains a high return even in the presence of transaction costs, confirming its insensitivity to transaction costs.
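
For intuition, a minimal numpy sketch of a semi-exponential-gradient update follows: the standard exponential gradient rebalancing step is applied only at segment boundaries of the investment horizon, with no trading in between. The learning rate and segment length are illustrative; the paper's closed-form factor-graph update is not reproduced here.

```python
import numpy as np

def seg_portfolio(price_relatives, eta=0.05, segment=5):
    """price_relatives: (T, m) array, x[t, i] = price_i(t) / price_i(t-1)."""
    T, m = price_relatives.shape
    b = np.full(m, 1.0 / m)                     # start from the uniform portfolio
    wealth = 1.0
    for t in range(T):
        x = price_relatives[t]
        wealth *= b @ x                         # daily return under portfolio b
        if (t + 1) % segment == 0:              # rebalance only between segments
            b = b * np.exp(eta * x / (b @ x))   # exponential gradient step
            b /= b.sum()                        # project back onto the simplex
    return wealth, b
```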
Imbalanced image classification approach based on convolution neural network and cost-sensitivity
TAN Jiefan, ZHU Yan, CHEN Tung-shou, CHANG Chin-chen
Journal of Computer Applications    2018, 38 (7): 1862-1865.   DOI: 10.11772/j.issn.1001-9081.2018010152
Focusing on the issues that the recall of the minority class is low, the cost of classification is high, and manual feature selection is expensive in imbalanced image classification, an imbalanced image classification approach based on a Triplet-sampling Convolutional Neural Network (Triplet-sampling CNN) and a Cost-Sensitive Support Vector Machine (CSSVM), called Triplet-CSSVM, was proposed. The method has two parts: feature learning and cost-sensitive classification. Firstly, a coding that maps images into a Euclidean space end-to-end was learned by a CNN trained with the triplet loss function. Then, the dataset was rebalanced by sampling. Finally, the classification result with the minimum cost was obtained by the CSSVM classification algorithm, which assigns different cost factors to different classes. Experiments were conducted on the portrait dataset FaceScrub with the deep learning framework Caffe. The experimental results show that, with a 1:3 imbalance ratio, the precision of the proposed method is increased by 31 percentage points and the recall by 71 percentage points compared with VGGNet-SVM (Visual Geometry Group Net-Support Vector Machine).
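
A condensed sketch of the two-stage pipeline is shown below: a triplet loss (stage 1) supplies the embedding-learning signal, and a cost-sensitive SVM with per-class cost factors (stage 2) classifies the embeddings. The CNN itself and the sampling step are omitted, and the 1:3 class weights simply mirror the imbalance ratio mentioned above; all are assumptions for illustration.

```python
import numpy as np
from sklearn.svm import SVC

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss on embedding vectors (stage-1 training signal)."""
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    return np.maximum(d_pos - d_neg + margin, 0.0).mean()

# Stage 2: cost-sensitive SVM on the learned embeddings. Giving the minority
# class (label 1) a cost factor three times the majority's makes
# misclassifying minority samples more expensive.
# embeddings, labels = cnn_embed(images), image_labels   # from stage 1
# cssvm = SVC(kernel="rbf", class_weight={0: 1.0, 1: 3.0}).fit(embeddings, labels)
```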
Optimum feature selection based on genetic algorithm under Web spam detection
WANG Jiaqing, ZHU Yan, CHEN Tung-shou, CHANG Chin-chen
Journal of Computer Applications    2018, 38 (1): 295-299.   DOI: 10.11772/j.issn.1001-9081.2017061560
Focusing on the issue that features used in Web spam detection are often high-dimensional and redundant, an Improved Feature Selection method Based on Information Gain and Genetic Algorithm (IFS-BIGGA) was proposed. Firstly, features were ranked by Information Gain (IG), and a dynamic threshold was set to remove redundant features. Secondly, the chromosome encoding function was modified and the selection operator was improved in the Genetic Algorithm (GA); the Area Under the receiver operating Characteristic curve (AUC) of a Random Forest (RF) classifier was then used as the fitness function to pick out features with a high degree of discrimination. Finally, the Optimal Minimum Feature Set (OMFS) was obtained by increasing the number of experimental iterations to reduce the randomness of the algorithm. The experimental results show that, compared with the high-dimensional feature set, OMFS decreases the AUC under RF by only 2% while increasing the True Positive Rate (TPR) by 21%, reducing the feature dimension by 92% and cutting the average detection time by 83%; moreover, compared with the Traditional GA (TGA) and the Imperialist Competitive Algorithm (ICA), the F1 score under Bayes Net (BN) is increased by 4.2% and 3.5% respectively. These results indicate that IFS-BIGGA can effectively reduce the feature dimension, and thereby the computational cost, improving detection efficiency in practical Web spam detection.
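
A compact sketch of the IFS-BIGGA loop might look as follows: features are ranked by an information-gain proxy, the low-IG tail is dropped, and a genetic search is run whose fitness is the AUC of a random forest on the candidate subset. The population size, rates, IG threshold and GA operators are illustrative stand-ins, not the paper's modified encoding and selection operator.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score

def ifs_bigga(X, y, pop_size=20, generations=30, ig_quantile=0.5, rng=None):
    rng = rng or np.random.default_rng(0)
    # Step 1: information-gain pre-filter (mutual information as the IG proxy).
    ig = mutual_info_classif(X, y)
    kept = np.where(ig >= np.quantile(ig, ig_quantile))[0]

    def fitness(mask):
        if not mask.any():
            return 0.0
        rf = RandomForestClassifier(n_estimators=50, random_state=0)
        return cross_val_score(rf, X[:, kept[mask]], y,
                               scoring="roc_auc", cv=3).mean()

    # Step 2: genetic search over subsets of the pre-filtered features.
    pop = rng.random((pop_size, len(kept))) < 0.5
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        parents = pop[np.argsort(scores)[-pop_size // 2:]]    # elitist selection
        children = parents.copy()
        cut = rng.integers(1, len(kept))                      # one-point crossover
        children[:, cut:] = parents[::-1, cut:]
        children ^= rng.random(children.shape) < 0.02         # bit-flip mutation
        pop = np.vstack([parents, children])[:pop_size]
    best = pop[np.argmax([fitness(ind) for ind in pop])]
    return kept[best]                                         # selected feature indices
```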
Combining topic similarity with link weight for Web spam ranking detection
WEI Sha, ZHU Yan
Journal of Computer Applications    2016, 36 (3): 735-739.   DOI: 10.11772/j.issn.1001-9081.2016.03.735
Focusing on the issue that good-to-bad links on the Web degrade the detection performance of ranking algorithms (e.g. Anti-TrustRank), a distrust ranking algorithm, Topic Link Distrust Rank (TLDR), which combines topic similarity with link weight to adjust the distrust propagation, was proposed. Firstly, the topic distribution of all pages was obtained by Latent Dirichlet Allocation (LDA), and the topic similarity of linked pages was computed. Secondly, the link weight was computed from the Web graph and combined with the topic similarity to obtain the topic-link weight matrix. Then, the Anti-TrustRank and Weighted Anti-TrustRank (WATR) algorithms were improved by propagating distrust scores according to the topic and link weights. Finally, all pages were ranked by their distrust scores, and spam pages were detected by applying a threshold. The experimental results on the dataset WEBSPAM-UK2007 show that, compared with Anti-TrustRank and WATR, the SpamFactor of TLDR is raised by 45% and 23.7%, the F1-measure (with threshold 600) is improved by 3.4 and 0.5 percentage points, and the spam ratio (in the top three buckets) is increased by 15 and 10 percentage points, respectively.
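
A simplified sketch of topic-weighted distrust propagation in the spirit of TLDR is given below: distrust seeded on known spam pages flows backwards along links, with each link weighted by the topic similarity of its endpoints. The cosine similarity, damping factor and iteration count are illustrative assumptions rather than the paper's formulation.

```python
import numpy as np

def tldr_scores(links, topics, seeds, alpha=0.85, iters=50):
    """links: (n, n) 0/1 adjacency, links[i, j] = 1 if page i links to page j.
    topics: (n, k) LDA topic distributions; seeds: indices of known spam."""
    n = links.shape[0]
    # Topic-link weight: link indicator scaled by endpoint topic similarity.
    norm = np.linalg.norm(topics, axis=1, keepdims=True) + 1e-12
    sim = (topics / norm) @ (topics / norm).T
    W = links * sim
    # Distrust propagates against link direction (from spam to its in-links),
    # so we walk the transposed, row-normalized weight matrix.
    P = W.T / (W.T.sum(axis=1, keepdims=True) + 1e-12)
    d = np.zeros(n)
    d[list(seeds)] = 1.0 / len(seeds)
    s = d.copy()
    for _ in range(iters):
        s = alpha * (P.T @ s) + (1 - alpha) * d
    return s          # rank pages by s; highest scores are the most distrusted
```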
Implementation of distributed index in cluster environment
WENG Haixing, GONG Xueqing, ZHU Yanchao, HU Hualiang
Journal of Computer Applications    2016, 36 (1): 1-7.   DOI: 10.11772/j.issn.1001-9081.2016.01.0001
To address the performance issues caused by accessing data through non-primary keys on a distributed storage system, the key technologies for implementing indexes on such systems were discussed. Based on an analysis of the features of new distributed storage systems, the key points in the design and implementation of distributed indexes were presented. Combining the characteristics of distributed storage systems with related indexing technologies, the organization and maintenance of indexes, data concurrency and other issues were described. A distributed indexing mechanism was then designed and implemented on the open source version of OceanBase, a distributed database system. Performance tests were run with the benchmarking tool YCSB. The experimental results show that although the distributed auxiliary index degrades system performance, the degradation can be kept within 5% under different data scales when system and storage characteristics are taken into account. In addition, a redundant-column scheme can improve index query performance by up to 100%.
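
A toy illustration of the redundant-column idea mentioned above follows. A secondary index normally stores only index-key to primary-key mappings, forcing a second lookup into the base table; storing frequently read columns redundantly in the index row lets the query be answered from the index alone. This is a conceptual sketch, not OceanBase's actual storage format or API.

```python
base_table = {}        # primary key -> full row
name_index = {}        # secondary key (name) -> (primary key, redundant cols)

def insert(row):
    base_table[row["id"]] = row
    # Maintain the index synchronously with the base row; the redundant
    # "email" column is copied so index-only reads can serve it.
    name_index.setdefault(row["name"], []).append((row["id"], row["email"]))

def email_by_name(name):
    # Index-only scan: no second round-trip to base_table is needed.
    return [email for _pk, email in name_index.get(name, [])]

insert({"id": 1, "name": "alice", "email": "a@example.com"})
assert email_by_name("alice") == ["a@example.com"]
```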
Noise-suppression method for flicker pixels in dynamic outdoor scenes based on ViBe
ZHOU Xiao, ZHAO Feng, ZHU Yanlin
Journal of Computer Applications    2015, 35 (6): 1739-1743.   DOI: 10.11772/j.issn.1001-9081.2015.06.1739
The Visual Background extractor (ViBe) model for moving target detection cannot avoid the interference caused by irregular flicker-pixel noise in dynamic outdoor scenes. To solve this issue, a flicker-pixel noise-suppression method based on the ViBe model algorithm was proposed. In the background model initialization stage, a fixed standard deviation of the background model samples was used as a threshold to limit the range of the samples and obtain suitable background model samples for each pixel. In the foreground detection stage, an adaptive detection threshold was applied to improve the accuracy of the detection result. During the background model update, edge inhibition was applied to background pixels on image edges to prevent erroneous background sample values from being updated into the model. On this basis, morphological operations were added to repair connected components and obtain more complete foreground images. Finally, the proposed method was compared with the original ViBe algorithm and a ViBe variant with morphological post-processing on multiple video sequences. The experimental results show that the proposed method can suppress flicker-pixel noise effectively and obtain more accurate results.
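
A fragmentary sketch of the two model-level changes is given below for a ViBe-style detector: at initialization, neighbourhood samples that deviate from the pixel value by more than a fixed bound are clamped, and at detection time the matching threshold is scaled per pixel with the spread of its samples. The sample count, radii and bounds are illustrative, not the paper's values.

```python
import numpy as np

def init_samples(first_frame, n_samples=20, std_limit=20.0, rng=None):
    rng = rng or np.random.default_rng(0)
    h, w = first_frame.shape
    samples = np.empty((n_samples, h, w), dtype=np.float32)
    for k in range(n_samples):
        # Sample each pixel's model from a random 8-neighbour (ViBe init).
        dy, dx = rng.integers(-1, 2, size=2)
        shifted = np.roll(first_frame, (dy, dx), axis=(0, 1)).astype(np.float32)
        # Clamp samples that stray beyond the allowed deviation, which keeps
        # flickering pixels from seeding the model with outliers.
        samples[k] = np.clip(shifted, first_frame - std_limit,
                             first_frame + std_limit)
    return samples

def detect(frame, samples, base_radius=20.0, min_matches=2):
    # Adaptive threshold: pixels with widely spread samples (dynamic areas)
    # get a larger matching radius, suppressing flicker false positives.
    radius = np.maximum(base_radius, 0.5 * samples.std(axis=0))
    matches = (np.abs(samples - frame.astype(np.float32)) < radius).sum(axis=0)
    return (matches < min_matches).astype(np.uint8)   # 1 = foreground
```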

Automatic brain extraction method based on hybrid level set model
AO Qian, ZHU Yanping, JIANG Shaofeng
Journal of Computer Applications    2013, 33 (07): 2014-2017.   DOI: 10.11772/j.issn.1001-9081.2013.07.2014
Automatic brain extraction is an important preprocessing step in the analysis of internal brain structures. To improve the extraction result, an automatic brain extraction method based on a modified Brain Extraction Tool (BET) and a hybrid level set model was proposed. The first step of the method was to obtain a rough brain boundary with the improved BET algorithm. Morphological dilation was then applied to the rough boundary to initialize the Region of Interest (ROI), in which a hybrid active contour model was defined to obtain a new contour. The ROI and the contour were updated iteratively until an accurate brain boundary was achieved. Seven Magnetic Resonance Imaging (MRI) volumes from the Internet Brain Segmentation Repository (IBSR) website were used in the experiment. The proposed method achieved a low average total misclassification ratio of 7.89%. The experimental results show that the proposed method is effective and feasible.
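
A schematic version of the iterate-until-stable loop is sketched below, substituting scikit-image's morphological geodesic active contour for the paper's hybrid level set model (an assumption made for brevity); the dilation radius and iteration counts are likewise illustrative.

```python
import numpy as np
from scipy import ndimage
from skimage.segmentation import (morphological_geodesic_active_contour,
                                  inverse_gaussian_gradient)

def extract_brain(volume_slice, rough_mask, rounds=5, dilate_iter=3):
    """rough_mask: binary rough brain boundary from the improved BET step."""
    gimage = inverse_gaussian_gradient(volume_slice.astype(float))
    mask = rough_mask.astype(bool)
    for _ in range(rounds):
        # Dilate the current boundary to form the ROI for the level set.
        roi = ndimage.binary_dilation(mask, iterations=dilate_iter)
        # Evolve a level set inside the ROI, seeded from the current mask.
        new_mask = morphological_geodesic_active_contour(
            gimage * roi, num_iter=50,
            init_level_set=mask.astype(np.int8)).astype(bool)
        if (new_mask == mask).all():       # contour stabilized: stop iterating
            break
        mask = new_mask
    return mask
```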
Smart rail transportation: an implementation of deeper intelligence
YANG Yan, ZHU Yan, DAI Qi, LI Tian-rui
Journal of Computer Applications    2012, 32 (05): 1205-1207.  
Rail travel has become one of the main modes of transportation for today's residents. The core of Smart Rail Transportation (SRT) is to transform the existing modes of railway transportation in a more intelligent way through modern information technology, aiming at more efficient, safe and comfortable intelligent transportation systems for human activities. This paper discussed the four steps towards deeper intelligence in SRT: intelligent data collection, intelligent data fusion, intelligent data mining and intelligent decision-making. These four steps form a spiral ascent of intelligent information processing and ultimately achieve deeper intelligence in SRT.
Hybrid BitTorrent traffic detection
LI Lin-qing, YANG Zhe, ZHU Yan-qin
Journal of Computer Applications    2011, 31 (12): 3210-3214.  
Peer-to-Peer (P2P) applications generate a large volume of traffic and seriously affect the quality of normal network services. Accurate and real-time identification of P2P traffic is important for network management. A hybrid approach consisting of three sub-methods was proposed to identify BitTorrent (BT) traffic. It applies application signatures to identify unencrypted traffic. For encrypted flows, a message-based method built on the features of the Message Stream Encryption (MSE) protocol was proposed. In addition, a pre-identification method based on signaling analysis was applied to predict BT flows and recognize them as early as the first packet with the SYN flag. Modified Vuze clients were used to label BT traffic in real traffic traces, yielding high-accuracy benchmark datasets for evaluating the hybrid approach. The results illustrate its effectiveness, especially for un- or semi-established flows, which have no obvious signatures or flow statistics.
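
For the signature sub-method, a minimal check is shown below: plaintext BitTorrent handshakes start with the byte 0x13 followed by the ASCII string "BitTorrent protocol", so unencrypted flows can be flagged from their first payload. The MSE-based and signaling-based stages, which handle encrypted flows, are not sketched.

```python
BT_HANDSHAKE = b"\x13BitTorrent protocol"

def is_plain_bt(payload: bytes) -> bool:
    """True if the first payload of a TCP flow is a plaintext BT handshake."""
    return payload.startswith(BT_HANDSHAKE)

assert is_plain_bt(b"\x13BitTorrent protocol" + b"\x00" * 8)
assert not is_plain_bt(b"GET / HTTP/1.1\r\n")
```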
Constructing pre-trained dynamic graph neural network to predict disappearance of academic cooperation behavior
DU Yu, ZHU Yan
Journal of Computer Applications    DOI: 10.11772/j.issn.1001-9081.2023091325
Available online: 22 December 2023